Abstract. We consider the X-Greedy Algorithm and the Dual
Greedy Algorithm in a finite-dimensional Banach space with a
strictly monotone basis as the dictionary. We show that when
the dictionary is an initial segment of the Haar basis in $L_p[0,1]$
($1 < p < \infty$), the algorithms terminate after finitely many
iterations and the number of iterations is bounded by a function
of the length of the initial segment. We also prove a more
general result for a class of strictly monotone bases.

1. Introduction 

Greedy algorithms in Hilbert space are known to have good convergence
properties. The first general result in this direction was obtained
by Huber [6], who proved convergence of the Pure Greedy Algorithm
(PGA) in the weak topology of a Hilbert space $H$ and conjectured that
the PGA converges strongly in $H$. Huber's conjecture was proved by
Jones [7].

Our interest in this paper is in convergence results for greedy
algorithms in a Banach space $X$ (see [12]). We say that
$\mathcal{D} \subset X$ is a dictionary if the linear span of
$\mathcal{D}$ is norm-dense in $X$ and $\|\varphi\| = 1$ for all
$\varphi \in \mathcal{D}$. (Usually, but not here, $\mathcal{D}$ is
also assumed to be symmetric.)
For some of the algorithms that have been proposed, e.g. the Weak
Chebyshev Dual Greedy Algorithm [1], [2] or the Weak Greedy Algorithm
with Free Relaxation [13], it is known that uniform smoothness
of $X$ guarantees strong convergence of these algorithms for an
arbitrary dictionary $\mathcal{D}$. Rate of convergence results have
also been proved.

We are mainly concerned with two natural generalizations of the
PGA to the Banach space setting, namely the X-Greedy Algorithm
(XGA) and the Dual Greedy Algorithm (DGA) (see [12]). These
algorithms generate a sequence of greedy approximants $(G_n)$ to an
initial vector $x$. The updated approximant $G_{n+1}$ is obtained from
$G_n$ by best one-term approximation of the residual $x - G_n$ in the
direction of a particular dictionary element
$\varphi_n \in \mathcal{D}$ which satisfies a certain selection
criterion. Precise definitions will be given below.

Livshits [8] constructed a dictionary in a smooth Banach space for
which the XGA fails to converge. No general convergence results for the 
strong topology are known for the XGA and the DGA for the class of 
uniformly smooth Banach spaces. In [3] convergence was proved (for 
an arbitrary dictionary) for the weak topology in uniformly smooth 
Banach spaces with the so-called WN Property. In particular, weak 
convergence was proved in uniformly smooth Banach spaces which are 
uniformly convex and have a 1-unconditional basis. Unfortunately, 
$L_p[0,1]$ ($p \ne 2$) does not enjoy the WN Property, so these results
cannot be applied to $L_p[0,1]$.

An important advance was made by Ganichev and Kalton [4], who
proved strong convergence of the DGA in $L_p[0,1]$ for an arbitrary
dictionary. More precisely, they introduced a geometrical property
called Property $\Gamma$, proved strong convergence of the DGA in
Banach spaces with Property $\Gamma$, and showed that all subspaces of
quotient spaces of $L_p[0,1]$ ($1 < p < \infty$) enjoy Property
$\Gamma$. In [5] Property $\Gamma$ was characterized via the notion of a
'tame' convex function, and using this characterization several other
important spaces were shown to enjoy Property $\Gamma$.

The arguments used by Ganichev and Kalton do not seem to yield
convergence results for the XGA. In particular, convergence of the XGA
in $L_p[0,1]$ is an open question. This is surprising because the XGA
yields the best one-term approximation at each step. Even for the
important special case of this problem in which the dictionary is the
Haar basis of $L_p[0,1]$ very little seems to be known.

Problem 1.1. Suppose that the dictionary is the Haar basis in
$L_p[0,1]$ ($p \ne 2$). Does the XGA converge strongly to the initial
vector $x$? Does it converge in the weak topology?

We attacked the finite-dimensional analogue of this problem and
obtained the following theorem, which is a corollary of our main result
(Theorem 3.6 below).

Theorem 1.2. Let $1 < p < \infty$ and let $(h_i^{(p)})_{i=0}^{\infty}$
be the normalized Haar basis for $L_p[0,1]$. Then, for each $m \ge 0$,
there exists a positive integer $N(p,m)$ such that, for the dictionary
$(h_i^{(p)})_{i=0}^{m}$, the XGA and DGA terminate in at most $N(p,m)$
iterations for every initial vector in the linear span of
$(h_i^{(p)})_{i=0}^{m}$.

We present an example of a non-monotone basis of the two-dimensional 
Euclidean space for which the XGA does not terminate. When the dic- 
tionary is a strictly monotone finite basis we show that for every initial 
vector the XGA and DGA terminate after finitely many iterations. To 
get a uniform bound on the number of iterations that is independent of
the initial vector, as in Theorem 1.2, we isolate a particular property
(Property P) of the Haar basis and prove the existence of a uniform
bound for all strictly monotone bases with Property P.

The paper is organized as follows. The greedy algorithms which we
consider are defined in the next section. Our main result is proved in
Section 3. The final section contains two estimates for the Haar basis
which lead to a refinement of Theorem 1.2 in the range $p > 2$.

2. Definitions and Notation 

First we recall some notation and terminology from Banach space
theory. We denote the unit sphere $\{x \in X : \|x\| = 1\}$ of $X$ by
$S_X$. We say that $F_x \in X^*$ is a norming functional for a nonzero
$x \in X$ when $\|F_x\|_{X^*} = 1$ and $F_x(x) = \|x\|$; by the
Hahn-Banach theorem, each nonzero $x \in X$ has at least one norming
functional. $X$ is smooth if $F_x$ is unique for every nonzero $x$.

It is known that the norm of a smooth finite-dimensional Banach
space is uniformly Fréchet differentiable, i.e.

$$\|x + y\| = 1 + F_x(y) + \varepsilon(x,y)\|y\| \tag{1}$$

for all $x, y \in X$ with $\|x\| = 1$, where $\varepsilon(x,y) \to 0$
uniformly for $(x,y) \in S_X \times X$ as $\|y\| \to 0$.

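A basis $(e_i)_{i=1}^m$ of an $m$-dimensional Banach space $X$ is said
to be strictly monotone if

$$\Big\|\sum_{i=1}^{i_0} a_i e_i\Big\| \le \Big\|\sum_{i=1}^{m} a_i e_i\Big\|$$

for all $1 \le i_0 \le m$ and $(a_i) \subset \mathbb{R}$, with equality
only if $a_i = 0$ for $i = i_0+1, \dots, m$. The dual basis
$(e_i^*)_{i=1}^m \subset X^*$ is defined by $e_i^*(e_j) = \delta_{ij}$.
The basis is normalized if $\|e_i\| = 1$ for $i = 1, \dots, m$. Note
that if $(e_i)_{i=1}^m$ is a normalized monotone basis then for all
$(a_i) \subset \mathbb{R}$, we have

$$\frac{1}{2}\max_{1 \le i \le m}|a_i| \le \Big\|\sum_{i=1}^{m} a_i e_i\Big\| \le \sum_{i=1}^{m}|a_i|. \tag{2}$$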

Let us recall the definition of the Haar basis functions defined on
$[0,1]$. Let $h_0 = 1$. For $n \ge 0$ and $0 \le k < 2^n$, we define
$h_i$ for $i = 2^n + k$ thus:

$$h_i = \begin{cases} 1 & \text{on } [k/2^n, (2k+1)/2^{n+1}), \\ -1 & \text{on } [(2k+1)/2^{n+1}, (k+1)/2^n), \\ 0 & \text{elsewhere.} \end{cases}$$

The Haar basis is a strictly monotone basis of $L_p[0,1]$ (equipped with
its usual norm $\|\cdot\|_p$) for $1 < p < \infty$.
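As an illustration of the definition of the Haar functions, the following Python sketch evaluates $h_i$ pointwise and samples it on a dyadic grid. The names `haar`, `haar_on_grid`, `lp_norm`, `normalized_haar` and the parameter `levels` are hypothetical helpers introduced here. The sketch assumes $i < 2^{\text{levels}}$, so that $h_i$ is constant on each of the $2^{\text{levels}}$ grid cells and the discrete $L_p$ norm is exact.

```python
def haar(i, t):
    """Value of the (unnormalized) Haar function h_i at a point t in [0, 1)."""
    if i == 0:
        return 1.0
    n = i.bit_length() - 1          # write i = 2**n + k with 0 <= k < 2**n
    k = i - 2 ** n
    left = k / 2 ** n
    mid = (2 * k + 1) / 2 ** (n + 1)
    right = (k + 1) / 2 ** n
    if left <= t < mid:
        return 1.0
    if mid <= t < right:
        return -1.0
    return 0.0


def haar_on_grid(i, levels):
    """Samples of h_i at the midpoints of the 2**levels dyadic cells."""
    cells = 2 ** levels
    return [haar(i, (j + 0.5) / cells) for j in range(cells)]


def lp_norm(values, p):
    """L_p[0,1] norm of a function that is constant on equal-length cells."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


def normalized_haar(i, levels, p):
    """Samples of h_i^(p) = h_i / ||h_i||_p on the dyadic grid."""
    h = haar_on_grid(i, levels)
    norm = lp_norm(h, p)
    return [v / norm for v in h]
```

For instance, `haar_on_grid(3, 2)` samples $h_3$ (the case $n = 1$, $k = 1$) on four cells and gives `[0.0, 0.0, 1.0, -1.0]`.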

The algorithms which we consider in this paper all arise from the
repeated application of a greedy step to a nonzero residual vector
$y \in X$. Let us describe the general form of this greedy step.

(i) Select $\varphi(y) \in \mathcal{D}$ by applying a selection
procedure (which depends on the particular algorithm in question)
to $y$. In general the selection procedure will allow many possible
choices for $\varphi(y)$.

(ii) Then select $\lambda(y) \in \mathbb{R}$ to minimize
$\|y - \lambda\varphi(y)\|$ over $\lambda$.

Starting with an initial vector $x \in X$, we generate a sequence of
residuals $(x_n)$ as follows.

(i) Set $x_0 := x$.

(ii) For $n \ge 1$, apply the greedy step to the residual
$y = x_{n-1}$ to obtain $\varphi_n := \varphi(x_{n-1}) \in \mathcal{D}$
and $\lambda_n := \lambda(x_{n-1}) \in \mathbb{R}$.

(iii) Set $x_n := x_{n-1} - \lambda_n\varphi_n$ to be the updated
residual.

The algorithm is said to converge (strongly) if $\|x_n\| \to 0$ as
$n \to \infty$. It is said to terminate after $N$ steps if $x_N = 0$.
For $n \ge 1$, the $n$th greedy approximant is defined by
$G_n = \sum_{i=1}^{n} \lambda_i\varphi_i$. Note that $G_n = x - x_n$ and
that $x = \sum_{i=1}^{\infty} \lambda_i\varphi_i$ (resp.
$x = \sum_{i=1}^{N} \lambda_i\varphi_i$) if the algorithm converges
(resp. terminates after $N$ steps).

Two important greedy algorithms of this type are the Weak X-Greedy
Algorithm (WXGA) and the Weak Dual Greedy Algorithm (WDGA)
(see [12]). In both cases a weakness parameter $\tau \in (0,1)$ is
specified in advance. For the WXGA with weakness parameter $\tau$ the
greedy step is as follows. Given a nonzero $x \in X$, we select
$\varphi(x) \in \mathcal{D}$ to satisfy

$$\|x\| - \min_{\lambda \in \mathbb{R}}\|x - \lambda\varphi(x)\| \ge \tau\Big(\|x\| - \inf_{\varphi \in \mathcal{D},\, \lambda \in \mathbb{R}}\|x - \lambda\varphi\|\Big). \tag{3}$$

We can also set $\tau = 1$ in the above when it can be shown that the
infimum in (3) is attained, e.g. if $\mathcal{D}$ is finite or if
$\mathcal{D}$ is a monotone basis for $X$; the case $\tau = 1$ is the
X-Greedy Algorithm (XGA) discussed in the Introduction.

For the WDGA with weakness parameter $\tau$ the greedy step is as
follows. Given a nonzero $y \in X$, choose $\varphi(y) \in \mathcal{D}$
such that

$$|F_y(\varphi(y))| \ge \tau \sup_{\varphi \in \mathcal{D}} |F_y(\varphi)|.$$

The case $\tau = 1$, when it makes sense, is the Dual Greedy Algorithm
(DGA) discussed in the Introduction. Smoothness of $X$ guarantees
that the residuals satisfy $\|x_n\| < \|x_{n-1}\|$ for both the WXGA and
the WDGA.
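The greedy step and the two selection rules (with weakness parameter equal to 1) can be sketched numerically for an initial segment of the Haar basis. This is a minimal sketch under stated assumptions, not the authors' implementation: functions in the span of $h_0, \dots, h_{2^{\text{levels}}-1}$ are stored as lists of cell values, the one-dimensional minimization is done by ternary search (valid because $\lambda \mapsto \|y - \lambda\varphi\|_p$ is convex), the norming functional of $y \in L_p$ is taken to be $\operatorname{sign}(y)|y|^{p-1}/\|y\|_p^{p-1}$, and the loop stops at a tolerance `tol` rather than at an exactly zero residual. All function names and parameters are hypothetical.

```python
def lp_norm(v, p):
    """L_p[0,1] norm of a function that is constant on len(v) equal cells."""
    return (sum(abs(a) ** p for a in v) / len(v)) ** (1.0 / p)


def haar_dictionary(levels, p):
    """Normalized Haar functions h_0, ..., h_{2**levels - 1} as cell values."""
    cells = 2 ** levels
    dictionary = []
    for i in range(cells):
        h = [1.0] * cells if i == 0 else [0.0] * cells
        if i > 0:
            n = i.bit_length() - 1           # i = 2**n + k
            k = i - 2 ** n
            width = cells // 2 ** n          # number of cells in the support
            for j in range(width):
                h[k * width + j] = 1.0 if j < width // 2 else -1.0
        norm = lp_norm(h, p)
        dictionary.append([a / norm for a in h])
    return dictionary


def step_to(y, phi, lam):
    """The vector y - lam * phi."""
    return [a - lam * b for a, b in zip(y, phi)]


def best_multiple(y, phi, p, iterations=200):
    """Ternary search for the minimizer of lam -> ||y - lam*phi||_p."""
    bound = 2.0 * lp_norm(y, p)              # minimizer has |lam| <= 2||y||
    lo, hi = -bound, bound
    for _ in range(iterations):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if lp_norm(step_to(y, phi, m1), p) <= lp_norm(step_to(y, phi, m2), p):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def norming_value(y, phi, p):
    """F_y(phi): integral of sign(y)|y|^(p-1) phi, over ||y||_p^(p-1)."""
    total = 0.0
    for a, b in zip(y, phi):
        if a != 0.0:
            total += (1.0 if a > 0 else -1.0) * abs(a) ** (p - 1.0) * b
    return total / len(y) / lp_norm(y, p) ** (p - 1.0)


def greedy(x, dictionary, p, rule, tol=1e-9, max_steps=200):
    """Run the XGA (rule='xga') or DGA (rule='dga'); return residual norms."""
    residual = list(x)
    norms = [lp_norm(residual, p)]
    while norms[-1] > tol and len(norms) <= max_steps:
        if rule == "xga":
            # X-greedy selection: the element giving the smallest new residual.
            trials = [step_to(residual, phi, best_multiple(residual, phi, p))
                      for phi in dictionary]
            residual = min(trials, key=lambda r: lp_norm(r, p))
        else:
            # Dual greedy selection: the element maximizing |F_y(phi)|.
            phi = max(dictionary,
                      key=lambda d: abs(norming_value(residual, d, p)))
            residual = step_to(residual, phi, best_multiple(residual, phi, p))
        norms.append(lp_norm(residual, p))
    return norms
```

For example, `greedy([1.0, -2.0, 0.5, 3.0], haar_dictionary(2, 3.0), 3.0, "xga")` returns the list of residual norms $\|x_0\|_p, \|x_1\|_p, \dots$, which is non-increasing up to rounding error. Because the line search is approximate, this sketch can only exhibit small residuals, not exact termination.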



3. Main Results 

Proposition 3.1. Suppose that $X$ is a finite-dimensional smooth
Banach space. Then there exists $\gamma \in (0,1)$ such that the greedy
steps of both the WXGA and WDGA applied to any nonzero $y \in X$ satisfy

$$\|y - \lambda(y)\varphi(y)\| \le \gamma\|y\|. \tag{4}$$

Proof. First we consider the WDGA with weakness parameter $\tau$. By
compactness of $S_X$ and continuity of the mapping $y \mapsto F_y$,
there exists $\delta > 0$ such that

$$\sup\{|F_y(\varphi)| : \varphi \in \mathcal{D}\} \ge \delta \qquad (y \in S_X).$$

Hence, the WDGA applied to $y \in S_X$ selects
$\varphi(y) \in \mathcal{D}$ such that $|F_y(\varphi(y))| \ge \tau\delta$.
By uniform Fréchet differentiability of the norm there exists
$\eta > 0$ such that for all $y \in S_X$ and for all $z \in X$ with
$\|z\| \le \eta$, we have $|\varepsilon(y,z)| \le \tau\delta/2$ in (1),
and hence

$$\|y - z\| = 1 - F_y(z) + \varepsilon(y,-z)\|z\| \le 1 - F_y(z) + \frac{\tau\delta}{2}\|z\|.$$

Setting $z = \pm\eta\varphi(y)$ for the appropriate choice of sign
yields $F_y(z) \ge \eta\tau\delta$, and hence

$$\|y - z\| \le 1 - \frac{\eta\tau\delta}{2}.$$

By homogeneity we get for all nonzero $y \in X$

$$\|y - \lambda(y)\varphi(y)\| \le \Big(1 - \frac{\eta\tau\delta}{2}\Big)\|y\|. \tag{5}$$

Setting $\tau = 1$ in the above yields an estimate for the DGA. Since the
greedy step of the XGA produces a residual with the smallest norm, it
follows that the same estimate must also hold for the XGA. But this
implies that (5) also holds for the WXGA with parameter $\tau$.

We turn now to consider the case in which $X$ is $m$-dimensional
($1 \le m < \infty$) and the dictionary is a strictly monotone
normalized basis $B = (e_i)_{i=1}^m$ for $X$. We shall say that the
algorithm is norm-reducing with constant $\gamma$ ($0 < \gamma < 1$) if
(4) holds for the greedy step.

Proposition 3.2. Suppose that the algorithm is norm-reducing with
constant $\gamma$. Then, for each initial vector $x \in X$, the
algorithm terminates after finitely many steps.



Proof. The proof is by induction on $m$. The result is trivial if
$m = 1$, so suppose $m > 1$ and that $x = \sum_{i=1}^{m} a_i e_i$. If
$a_m = 0$, then by monotonicity of $B$ the algorithm will never select
$e_m$, so the result follows by induction. So suppose that
$a_m \ne 0$. If the algorithm selects $e_m$ at the $n$th step, then by
strict monotonicity the new residual $x_n$ satisfies $e_m^*(x_n) = 0$,
i.e. the last coefficient is set equal to zero, and the result follows
by induction. Thus to conclude the proof it suffices to show that $e_m$
is eventually selected. But if $e_m$ is never selected then
$e_m^*(x_n) = a_m$ for all $n \ge 1$, so by (2)

$$\gamma^n\|x\| \ge \|x_n\| \ge \frac{1}{2}\max_{1 \le i \le m}|e_i^*(x_n)| \ge \frac{|a_m|}{2},$$

which is a contradiction when $n$ is larger than
$\ln(2\|x\|/|a_m|)/\ln(\gamma^{-1})$.

Example 3.3. Monotonicity of the basis is essential. Indeed, consider
the basis $B = \{(1,0), (1/\sqrt{2}, 1/\sqrt{2})\}$ of 2-dimensional
Euclidean space. It is easily seen that the XGA does not terminate
unless the initial vector is a multiple of one of the basis vectors.
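The behaviour in Example 3.3 can be checked numerically. The sketch below relies on the standard fact that in Euclidean space the best multiple of a unit vector $e$ is $\langle x, e\rangle e$, so the XGA selects the basis vector with the largest $|\langle x, e\rangle|$ and subtracts the projection. The names `BASIS`, `xga_step` and `residual_norms` are hypothetical.

```python
import math

# The non-monotone basis of Example 3.3.
BASIS = [(1.0, 0.0), (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))]


def xga_step(x):
    """One XGA step in the Euclidean plane.

    Returns the new residual and the index of the selected basis vector.
    """
    dots = [x[0] * e[0] + x[1] * e[1] for e in BASIS]
    j = max(range(len(BASIS)), key=lambda i: abs(dots[i]))
    e = BASIS[j]
    return (x[0] - dots[j] * e[0], x[1] - dots[j] * e[1]), j


def residual_norms(x, steps):
    """Euclidean norms of the residuals x_0, x_1, ..., x_steps."""
    norms = [math.hypot(x[0], x[1])]
    for _ in range(steps):
        x, _ = xga_step(x)
        norms.append(math.hypot(x[0], x[1]))
    return norms
```

Starting from $x = (1, 2)$, the first step removes the component along $(1/\sqrt{2}, 1/\sqrt{2})$ and leaves $(-1/2, 1/2)$. From then on the residual is always orthogonal to the vector just selected and makes an angle of $45^\circ$ or $135^\circ$ with the other one, so each step multiplies the norm by $1/\sqrt{2}$ and the residual never vanishes. Starting from a multiple of a basis vector, such as $(3, 0)$, the residual is zero after one step.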

Problem 3.4. The estimate $n \le \ln(2\|x\|/|a_m|)/\ln(\gamma^{-1})$
for the number of steps before the algorithm terminates clearly depends
on $x$ and becomes unbounded as $a_m \to 0$. Is there a uniform bound
$N$ which is independent of the initial vector $x$?
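To see how the estimate in Problem 3.4 degrades, the following sketch evaluates $\ln(2\|x\|/|a_m|)/\ln(\gamma^{-1})$ for a fixed norm and shrinking last coefficient. The function name `step_bound` and the sample values of $\gamma$ and $a_m$ are illustrative assumptions only.

```python
import math


def step_bound(norm_x, a_m, gamma):
    """The quantity ln(2||x|| / |a_m|) / ln(1 / gamma) from Problem 3.4."""
    return math.log(2.0 * norm_x / abs(a_m)) / math.log(1.0 / gamma)


if __name__ == "__main__":
    # With ||x|| = 1 and gamma = 0.9 the bound grows without limit as a_m -> 0.
    for a_m in (1e-1, 1e-3, 1e-6, 1e-12):
        print(a_m, step_bound(1.0, a_m, 0.9))
```

The bound grows only logarithmically in $1/|a_m|$, but it is still unbounded, which is why a different argument is needed for a uniform estimate.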

We shall now provide a sufficient condition which guarantees a pos- 
itive answer to this question. Then we verify that the initial segments 
of the Haar basis satisfy this condition. 

Definition 3.5. Let $B = (e_i)_{i=1}^m$ be a normalized monotone basis
for $X$. We say that $B$ has Property P with constant $\zeta > 0$ if the
following condition is satisfied: for all
$x = \sum_{i=1}^{m} a_i e_i \in X$ and for all $1 \le i_0 \le m-1$, we
have

$$|t_0| \le \zeta \sum_{i=i_0+1}^{m} |a_i|,$$

where $t_0$ minimizes the mapping
$t \mapsto \big\|\sum_{i=1}^{i_0-1} a_i e_i + t e_{i_0} + \sum_{i=i_0+1}^{m} a_i e_i\big\|$.

Now we can state our main result. 

Theorem 3.6. Suppose that $X$ is $m$-dimensional, that $B$ is a strictly
monotone basis for $X$ which has Property P with constant $\zeta$, and
that the algorithm is norm-reducing with constant $\gamma$. Then there
exists a positive integer $N(m, \gamma, \zeta)$ such that the algorithm
terminates in at most $N$ steps for every initial vector $x \in X$.

The proof of Theorem 3.6 requires some combinatorial notation which
we shall now describe. For positive integers $r$ and $s$, with
$r \le s$, the integer interval $\{n \in \mathbb{N} : r \le n \le s\}$
will be denoted by $[r,s]$. If $I_1$ and $I_2$ are integer intervals we
write $I_2 < I_1$ if $\max I_2 < \min I_1$, and we say they are
consecutive if $\min I_1 = \max I_2 + 1$.

For $1 \le k \le m$, an interval partition of $[1,m]$ is a $k$-tuple
$P = (I_1, \dots, I_k)$ of consecutive integer intervals
$I_1, \dots, I_k$ such that $\min I_k = 1$, $\max I_1 = m$, and
$I_k < I_{k-1} < \dots < I_1$. The collection $\mathcal{P}(m)$ of all
interval partitions of $[1,m]$ is readily seen to have cardinality
$2^{m-1}$. We endow $\mathcal{P}(m)$ with the lexicographical ordering
$\prec$, i.e., if $P_1 = (I_1, \dots, I_r)$ and
$P_2 = (J_1, \dots, J_s)$ are two interval partitions then
$P_1 \prec P_2$ if, for some $t \ge 1$, we have
$\operatorname{card} I_u = \operatorname{card} J_u$ for $1 \le u < t$
and $\operatorname{card} I_t < \operatorname{card} J_t$. Note that
$([1,m])$ is the maximum element of $\mathcal{P}(m)$.
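The combinatorics of interval partitions are easy to check by enumeration. In the sketch below each interval is a pair `(lo, hi)`, a partition is listed with the interval containing $m$ first, and the lexicographical ordering is compared through the tuple of cardinalities. Python's built-in tuple comparison agrees with the ordering described above here, because two distinct cardinality tuples with the same sum $m$ cannot be prefixes of one another. The names `interval_partitions` and `cardinalities` are hypothetical.

```python
def interval_partitions(m):
    """All interval partitions of [1, m], as tuples (I_1, ..., I_k) with m in I_1."""
    if m == 0:
        return [()]
    result = []
    for size in range(1, m + 1):           # size = card I_1, I_1 = [m-size+1, m]
        first = (m - size + 1, m)
        for rest in interval_partitions(m - size):
            result.append((first,) + rest)
    return result


def cardinalities(partition):
    """The tuple (card I_1, ..., card I_k) used for the lexicographical order."""
    return tuple(hi - lo + 1 for lo, hi in partition)


if __name__ == "__main__":
    parts = interval_partitions(4)
    print(len(parts))                       # number of interval partitions of [1, 4]
    print(max(parts, key=cardinalities))    # the maximum element
```

For $m = 4$ the enumeration produces $2^{3} = 8$ partitions, and the maximum with respect to the ordering is the single interval $[1,4]$.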



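For $y = \sum_{i=1}^{m} a_i e_i \in X$ we define
$P(y) = (I_1, \dots, I_k) \in \mathcal{P}(m)$ by 'backwards induction'
as follows:

(i) $m \in I_1$;

(ii) Suppose that $1 \le i < m$ and that $i+1 \in I_j$. Then

$$\begin{cases} i \in I_j & \text{if } |a_i| \le (1+\zeta)^{m-i}\sum_{s=1}^{j}|a_{\max I_s}|, \\ i = \max I_{j+1} & \text{otherwise.} \end{cases} \tag{6}$$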

It may be helpful to explain the intuition behind this definition. The
definition of $P(y)$ begins with $I_1$. Working backwards from
$i = m \in I_1$, $i$ is placed in the same interval $I_j$ as $i+1$ if
the coefficient $|a_i|$ is not too much larger (roughly speaking) than
the later coefficients $|a_{i+1}|, \dots, |a_m|$. But if $|a_i|$ is much
larger than the later coefficients then a new interval is begun for
which $i = \max I_{j+1}$. Note that

$$\sum_{i=1}^{m}|a_i| \le \Big(\sum_{j=1}^{k}|a_{\max I_j}|\Big)\sum_{i=1}^{m}(1+\zeta)^{m-i} \le \frac{m(1+\zeta)^m}{\zeta}\max_{1 \le j \le k}|a_{\max I_j}|. \tag{7}$$

Lemma 3.7. For each initial vector $y \in X$ with
$P(y) = (I_1, \dots, I_k)$ there exists
$i_0 \in \{\max I_j : 1 \le j \le k\}$ such that the algorithm selects
$e_{i_0}$ in at most $n_0$ steps, where

$$n_0 \le 1 + \Big\lfloor \frac{\ln(2m(1+\zeta)^m/\zeta)}{\ln(\gamma^{-1})} \Big\rfloor. \tag{8}$$

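Proof. Let $i_0$ be defined by

$$|a_{i_0}| = \max\{|a_i| : i \in \{\max I_j : 1 \le j \le k\}\}.$$

Suppose that $e_{i_0}$ is first selected at the $(n_0)$th step. Then the
residual $y_{n_0-1}$ satisfies by (2) and (7)

$$\gamma^{n_0-1}\|y\| \ge \|y_{n_0-1}\| \ge \frac{1}{2}|a_{i_0}| \ge \frac{\zeta}{2m(1+\zeta)^m}\sum_{i=1}^{m}|a_i| \ge \frac{\zeta}{2m(1+\zeta)^m}\|y\|,$$

and the result follows.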

Lemma 3.8. Suppose that when applied to $y$ the algorithm selects
$e_{i_0}$ and produces a residual $z$. Let $P(y) = (I_1, \dots, I_k)$
and $P(z) = (J_1, \dots, J_l)$. Then either $i_0 = m$ or

$$P(y) \begin{cases} \prec P(z) & \text{if } i_0 \in \{\max I_j : 2 \le j \le k\}, \\ = P(z) & \text{otherwise.} \end{cases}$$

Proof. We may assume that $i_0 < m$. Suppose that $i_0 + 1 \in I_{j_0}$.
Let $y = \sum_{i=1}^{m} a_i e_i$ and $z = \sum_{i=1}^{m} b_i e_i$.
Clearly, $b_i = a_i$ if $i \ne i_0$. Thus by (6), $J_j = I_j$ for
$j < j_0$ and $\max J_{j_0} = \max I_{j_0}$. Since $B$ has Property P
with constant $\zeta$, and using the estimate
$|a_i| \le (1+\zeta)^{m-i}\big(\sum_{j=1}^{j_0}|a_{\max I_j}|\big)$ for
$i > i_0$ which follows from (6), we get

$$|b_{i_0}| \le \zeta\sum_{i=i_0+1}^{m}|a_i| \le \zeta\sum_{i=i_0+1}^{m}(1+\zeta)^{m-i}\Big(\sum_{j=1}^{j_0}|a_{\max I_j}|\Big) \le (1+\zeta)^{m-i_0}\Big(\sum_{j=1}^{j_0}|b_{\max J_j}|\Big).$$

Thus, by (6), $i_0 \in J_{j_0}$. In particular, if
$i_0 \notin I_{j_0}$ (in which case $i_0 = \max I_{j_0+1}$), then
$\operatorname{card}(J_{j_0}) > \operatorname{card}(I_{j_0})$, so
$P(y) \prec P(z)$. On the other hand, if $i_0 \in I_{j_0}$, then using
the facts that $b_i = a_i$ if $i \ne i_0$ and that
$i_0 \ne \max J_{j_0}$, it follows again from (6) that $P(y) = P(z)$.

Proof of Theorem 3.6. The proof is by induction on $m$. Let $x \in X$.
We may assume that $e_m^*(x) \ne 0$. It suffices to give a bound
independent of $x$ for the number of steps required for the algorithm to
select $e_m$. Let $P(x) = (I_1, \dots, I_k)$. Then by Lemma 3.7 the
algorithm selects either $e_m$ or $e_{i_0}$, where
$i_0 \in \{\max I_j : 2 \le j \le k\}$, in at most $n_0$ steps. In the
latter case, by Lemma 3.8, $P(x) \prec P(x_{n_0})$. Repeating the
argument with $x$ replaced by $x_{n_0}$, we find that either $e_m$ is
selected in the first $2n_0$ steps or $P(x_{n_0}) \prec P(x_{2n_0})$.
After a total of at most
$\operatorname{card}(\mathcal{P}(m)) - 1 = 2^{m-1} - 1$ iterations of
this argument, we find that either $e_m$ is selected in the first
$(2^{m-1} - 1)n_0$ steps or $P(x_{(2^{m-1}-1)n_0}) = ([1,m])$, the
maximum element of $\mathcal{P}(m)$. In the latter case, by Lemma 3.7,
$e_m$ will be selected in at most a further $n_0$ steps. In conclusion,
$e_m$ will be selected in at most $2^{m-1}n_0$ steps. This leads to the
estimate

$$N(m, \gamma, \zeta) = n_0\sum_{s=1}^{m} 2^{s-1} = (2^m - 1)n_0. \tag{9}$$
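The size of the resulting bound can be tabulated directly. The sketch below reads (8) as $n_0 = 1 + \lfloor \ln(2m(1+\zeta)^m/\zeta)/\ln(\gamma^{-1}) \rfloor$ (taking the inequality with equality) and (9) as $N = (2^m - 1)n_0$; both readings, the function names `n0` and `iteration_bound`, and the sample values of $\gamma$ and $\zeta$ are assumptions made for illustration.

```python
import math


def n0(m, gamma, zeta):
    """Right-hand side of (8): 1 + floor(ln(2m(1+zeta)^m / zeta) / ln(1/gamma))."""
    numerator = math.log(2 * m * (1 + zeta) ** m / zeta)
    return 1 + math.floor(numerator / math.log(1 / gamma))


def iteration_bound(m, gamma, zeta):
    """Right-hand side of (9): n0 * sum_{s=1}^{m} 2^(s-1) = (2^m - 1) * n0."""
    return (2 ** m - 1) * n0(m, gamma, zeta)


if __name__ == "__main__":
    # Illustrative values only: gamma = 0.9 and zeta = 1 are not taken from the text.
    for m in (1, 2, 3, 5, 10):
        print(m, n0(m, 0.9, 1.0), iteration_bound(m, 0.9, 1.0))
```

With these sample values, $m = 3$ gives $n_0 = 37$ and a bound of $7 \cdot 37 = 259$ steps. The factor $2^m - 1$ dominates as $m$ grows.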

Our next goal is to show that all initial segments of the Haar basis
for $L_p[0,1]$ ($1 < p < \infty$) have Property P with constant $\zeta$
depending on $m$ and $p$. In the next section we prove that if $p > 2$
then $\zeta$ may be chosen independently of $m$.

Lemma 3.9. Let $1 < p < \infty$ and let $h_i^{(p)} = h_i/\|h_i\|_p$
($i \ge 0$). For each $m \ge 1$ there exists a positive constant
$C(m,p)$ such that, for all $M \in \mathbb{R}$, if
$|a_1| \ge C(m,p)\sum_{i=2}^{m}|a_i|$, then

$$\Big\|M + \sum_{i=1}^{m} a_i h_i^{(p)}\Big\|_p \ge \Big\|M + \sum_{i=2}^{m} a_i h_i^{(p)}\Big\|_p. \tag{10}$$

Proof. If $M = 0$ we can take $C(m,p) = 2$ by an easy triangle
inequality calculation. If $M \ne 0$ then by homogeneity of the norm we
may assume that $M = 1$. By expanding in a Taylor series, we see that
there exist positive constants $b_1, \dots, b_m$ such that

$$\Big\|1 + \sum_{i=1}^{m} a_i h_i^{(p)}\Big\|_p^p = \int_0^1 \Big|1 + \sum_{i=1}^{m} a_i h_i^{(p)}\Big|^p dt = 1 + \sum_{i=1}^{m} b_i a_i^2 + o\Big(\sum_{i=1}^{m} a_i^2\Big).$$

Thus there exists $0 < \varepsilon < 1$ such that if
$|a_1| = \sum_{i=2}^{m}|a_i| \le \varepsilon$ then (10) is satisfied. By
convexity of the mapping

$$t \mapsto \Big\|1 + t h_1^{(p)} + \sum_{i=2}^{m} a_i h_i^{(p)}\Big\|_p,$$

it follows that (10) is also satisfied whenever
$\sum_{i=2}^{m}|a_i| \le \varepsilon$ and
$|a_1| \ge \sum_{i=2}^{m}|a_i|$. Now suppose that
$\sum_{i=2}^{m}|a_i| \ge \varepsilon$. If

$$|a_1| \ge (2 + 2/\varepsilon)\sum_{i=2}^{m}|a_i| \ge 2 + 2\sum_{i=2}^{m}|a_i|,$$

then by the triangle inequality

$$\Big\|1 + \sum_{i=1}^{m} a_i h_i^{(p)}\Big\|_p \ge |a_1| - 1 - \sum_{i=2}^{m}|a_i| \ge 2 + 2\sum_{i=2}^{m}|a_i| - 1 - \sum_{i=2}^{m}|a_i| = 1 + \sum_{i=2}^{m}|a_i| \ge \Big\|1 + \sum_{i=2}^{m} a_i h_i^{(p)}\Big\|_p.$$

Thus, $C(m,p) = 2 + 2/\varepsilon$ works.



Proposition 3.10. Let $1 < p < \infty$. For each $m \ge 1$, the initial
segment $(h_i^{(p)})_{i=0}^{m}$ of the Haar basis for $L_p[0,1]$ has
Property P with constant $\zeta = C(m,p)$.

Proof. Let $0 \le i_0 < m$. Suppose $t_0$ minimizes the function

$$t \mapsto \Big\|\sum_{i=0}^{i_0-1} a_i h_i^{(p)} + t h_{i_0}^{(p)} + \sum_{i=i_0+1}^{m} a_i h_i^{(p)}\Big\|_p$$

for fixed coefficients $(a_i) \subset \mathbb{R}$. Suppose that
$h_{i_0}^{(p)}$ is supported on the dyadic interval $I$ and let $M$ be
the (constant) value assumed by $\sum_{i=0}^{i_0-1} a_i h_i^{(p)}$ on
$I$. Then $t_0$ minimizes the function

$$t \mapsto \int_I \Big|M + t h_{i_0}^{(p)} + \sum_{i=i_0+1}^{m} a_i h_i^{(p)}\Big|^p dx.$$

Lemma 3.9 obviously transfers from $[0,1]$ to $I$. So

$$|t_0| \le C(m,p)\sum_{i=i_0+1}^{m}|a_i|.$$

Note that in view of the preceding result the initial segments of the
Haar basis in $L_p[0,1]$ satisfy the hypotheses of Theorem 3.6. Thus,
Theorem 1.2 is a special case of Theorem 3.6.

4. Further Results 

In this section we present some more precise estimates for the Haar
basis. First we estimate the norm-reducing constant $\gamma$. Then we
show that for $p > 2$ the constant $\zeta$ for Property P may be chosen
to be independent of $m$.

Recall that the modulus of smoothness $\rho_X(t)$ of a Banach space $X$
is defined for $0 < t < 1$ by

$$\rho_X(t) = \sup\Big\{\frac{\|x+y\| + \|x-y\|}{2} - 1 : x, y \in X,\ \|x\| = 1,\ \|y\| = t\Big\}$$

(see [9, p. 59]). The modulus of smoothness for $L_p[0,1]$ satisfies

$$\rho_{L_p[0,1]}(t) \le \begin{cases} c_p t^p & \text{if } 1 < p \le 2, \\ c_p t^2 & \text{if } 2 \le p < \infty, \end{cases}$$

where $c_p$ is a constant (see [9, p. 63]).

Proposition 4.1. Suppose that $m \ge 1$ and that $A \subset \mathbb{N}$
has cardinality $m$. For $\mathcal{D}_A := (h_i^{(p)})_{i \in A}$ and
$X_A := \operatorname{span}\mathcal{D}_A \subset L_p[0,1]$ we have that
the DGA and XGA are norm-reducing with constant

$$\gamma \le \begin{cases} 1 - c'_p m^{p/(2-2p)} & \text{if } 1 < p \le 2, \\ 1 - c'_p m^{(2-2p)/p} & \text{if } 2 \le p < \infty, \end{cases}$$

where $c'_p$ is a constant depending only on $p$.

Proof. The XGA produces the greatest norm reduction at each step, so
it suffices to prove the result for the DGA. For convenience let $c$
denote a constant depending only on $p$ whose precise value may change
from line to line. First we consider the case $1 < p \le 2$. Let
$y = \sum_{i \in A} a_i h_i^{(p)} \in S_{X_A}$ and let
$F_y = \sum_{i \in A} b_i h_i^{(q)} \in S_{X_A^*}$, where $q = p/(p-1)$.
Note that the Haar basis in $L_q[0,1]$ satisfies an upper 2-estimate for
$q \ge 2$ (see [1]). Thus, $\sum_{i \in A}|b_i|^2 \ge c$, and since
$\operatorname{card} A = m$ we get

$$|b_{i_0}| := \max_{i \in A}|b_i| \ge \frac{c}{\sqrt{m}}.$$

We may assume that $b_{i_0} > 0$. Thus, for $t > 0$, we have

$$\|y + t h_{i_0}^{(p)}\|_p \ge F_y(y + t h_{i_0}^{(p)}) = 1 + t b_{i_0} \ge 1 + \frac{ct}{\sqrt{m}}.$$

Hence

$$\|y - t h_{i_0}^{(p)}\|_p \le 2 - \|y + t h_{i_0}^{(p)}\|_p + 2\rho_{L_p[0,1]}(t) \le 2 - \Big(1 + \frac{ct}{\sqrt{m}}\Big) + 2c_p t^p = 1 - \frac{ct}{\sqrt{m}} + 2c_p t^p.$$

Choosing $t$ to minimize $1 - (ct/\sqrt{m}) + 2c_p t^p$ yields
$\gamma \le 1 - c\,m^{p/(2-2p)}$. The case $p > 2$ is proved similarly
using the fact that the Haar basis in $L_q[0,1]$ satisfies an upper
$q$-estimate for $q < 2$.
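The minimization at the end of the case $1 < p \le 2$ can be written out explicitly. The following worked computation is a sketch under the assumption that the minimizing value lies in the range of $t$ where the modulus of smoothness estimate applies; the symbols $f$, $t_*$ and $c''$ are introduced here for illustration and do not appear in the proof.

```latex
f(t) = 1 - \frac{ct}{\sqrt{m}} + 2c_p t^p,
\qquad
f'(t_*) = 0 \iff t_*^{\,p-1} = \frac{c}{2p\,c_p\sqrt{m}},
\\[4pt]
2c_p t_*^{\,p} = t_* \cdot 2c_p t_*^{\,p-1} = \frac{c\,t_*}{p\sqrt{m}}
\quad\Longrightarrow\quad
f(t_*) = 1 - \Big(1 - \frac{1}{p}\Big)\frac{c\,t_*}{\sqrt{m}},
\\[4pt]
t_* = \Big(\frac{c}{2p\,c_p}\Big)^{\frac{1}{p-1}} m^{-\frac{1}{2(p-1)}}
\quad\Longrightarrow\quad
f(t_*) = 1 - c''\, m^{-\frac{1}{2} - \frac{1}{2(p-1)}}
       = 1 - c''\, m^{\frac{p}{2-2p}},
\\[4pt]
c'' = \Big(1 - \frac{1}{p}\Big)\, c^{\frac{p}{p-1}}\, (2p\,c_p)^{-\frac{1}{p-1}}.
```

The exponent simplifies because $-\tfrac{1}{2} - \tfrac{1}{2(p-1)} = -\tfrac{p}{2(p-1)} = \tfrac{p}{2-2p}$, which matches the exponent in Proposition 4.1.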

Proposition 4.2. Suppose that $2 < p < \infty$. Then for all
$y \in \operatorname{span}(h_i)_{i=2}^{\infty}$, we have

$$\|1 + t\|y\|_p h_1^{(p)} + y\|_p \ge \|1 + y\|_p$$

provided $|t| \ge \max(4, 2^{(p-3)/2}\sqrt{p(p-1)})$.

Proof. If $\|y\|_p \ge 1$ then the result holds for $|t| \ge 4$ by the
triangle inequality. So assume $\|y\|_p \le 1$. For $p > 2$,
$f(x) = |x|^p$ is twice differentiable. Thus, by the Mean Value Theorem,
for all $x \in \mathbb{R}$ there exists $0 < \theta(x) < 1$ such that

$$|1 + x|^p = 1 + px + \frac{p(p-1)}{2}x^2|1 + \theta(x)x|^{p-2}.$$

Thus, for all $y \in \operatorname{span}(h_i)_{i=2}^{\infty}$ with
$\|y\|_p \le 1$, we have

$$\int_0^1 |1 + y(s)|^p\,ds \le 1 + p\int_0^1 y(s)\,ds + \frac{p(p-1)}{2}\int_0^1 y(s)^2(1 + |y(s)|)^{p-2}\,ds$$

$$= 1 + 0 + \frac{p(p-1)}{2}\int_0^1 y(s)^2(1 + |y(s)|)^{p-2}\,ds$$

$$\le 1 + \frac{p(p-1)}{2}\|y\|_p^2\,\|1 + |y|\|_p^{p-2}$$

(by Hölder's inequality for the conjugate indices $p/2$ and $p/(p-2)$)

$$\le 1 + 2^{p-2}\Big(\frac{p(p-1)}{2}\Big)\|y\|_p^2,$$

using the fact that $\|y\|_p \le 1$ in the last line. Hence

$$\|1 + y\|_p \le (1 + 2^{p-3}p(p-1)\|y\|_p^2)^{1/p}. \tag{11}$$

On the other hand, since $p > 2$, we have

$$\|1 + t\|y\|_p h_1^{(p)} + y\|_p \ge \|1 + t\|y\|_p h_1^{(p)} + y\|_2 \ge \|1 + t\|y\|_p h_1^{(p)}\|_2 = (1 + t^2\|y\|_p^2)^{1/2}. \tag{12}$$

Combining (11) and (12) yields the result.

Corollary 4.3. Let $2 < p < \infty$. Every finite subsequence of the
Haar basis for $L_p[0,1]$ has Property P with constant

$$\zeta = \max(4, 2^{(p-3)/2}\sqrt{p(p-1)}).$$
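The constant in Corollary 4.3 is easy to evaluate. The sketch below reads it as $\max(4, 2^{(p-3)/2}\sqrt{p(p-1)})$, which is the threshold on $|t|$ in Proposition 4.2; the function name `property_p_constant` is hypothetical.

```python
import math


def property_p_constant(p):
    """The constant max(4, 2^((p-3)/2) * sqrt(p(p-1))) for p > 2."""
    return max(4.0, 2.0 ** ((p - 3.0) / 2.0) * math.sqrt(p * (p - 1.0)))


if __name__ == "__main__":
    for p in (2.5, 3.0, 4.0, 6.0, 10.0):
        print(p, property_p_constant(p))
```

For $p = 3$ the second term equals $\sqrt{6} < 4$, so the constant is 4. For $p = 4$ it equals $\sqrt{2}\cdot\sqrt{12} = \sqrt{24} > 4$, and from there on the constant grows roughly like $2^{p/2}p$. In either case it does not depend on $m$.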

Combining Proposition 4.1 with Corollary 4.3, and using the estimates
(8) and (9) for the number of iterations, yields the following
strengthening of Theorem 1.2 in the range $p > 2$ in which the initial
segment of the Haar basis of length $m$ is replaced by any subset of
cardinality $m$.

Theorem 4.4. Let $2 < p < \infty$ and let $m \ge 1$. Then, for all
$A \subset \mathbb{N}$ of cardinality $m$, the XGA and DGA terminate in
at most $O(2^m m \ln m)$ iterations for the dictionary $\mathcal{D}_A$
and for every initial vector in $X_A$.

Remark 4.5. We do not know whether or not the last result holds also 
for 1 < p < 2. 